Abstract

The foundation of modeling and synthesizing reactive processes is the theory
of graph games with ω-regular winning conditions. In the case of stochastic
reactive processes, the corresponding stochastic graph games have three
players, two of them (System and Environment) behaving adversarially, and the
third (Uncertainty) behaving probabilistically. We consider two solution
problems for stochastic graph games: a qualitative problem, calling for the
computation of the set of states from which a player can win with
probability 1 (almost-sure winning), and a quantitative problem, calling for
the computation of the maximal probability of winning (optimal winning) from
each state. We show that, for Rabin winning conditions, both problems are in
NP. As these problems were known to be NP-hard, it follows that they are
NP-complete for Rabin conditions, and dually, coNP-complete for Streett
conditions. The proof proceeds by showing that pure memoryless strategies
suffice for qualitatively and quantitatively winning stochastic graph games
with Rabin conditions. This fact was an open problem, and it is of interest in
its own right, as it implies that controllers for Rabin objectives have simple
implementations. We also prove that for any ω-regular objective optimal
winning strategies are no more complex than almost-sure winning strategies.

*The work was supported by the AFOSR MURI grant F49620-00-1-0327, by the ONR
grant N00014-02-1-0671, and by the NSF grants CCR-0132780 (Career),
CCR-0234690, CCR-9988172, and CCR-0225610.


1  Introduction 


A stochastic graph game is played on a directed graph with three kinds of
states: player-1 states, player-2 states, and probabilistic states. At
player-1 states, the first player chooses a successor state; at player-2
states, the second player chooses a successor state; and at probabilistic
states, a successor state is chosen according to a given probability
distribution. The result of playing the game forever is an infinite path
through the graph. If there are no probabilistic states, we refer to the game
as a 2-player graph game, and otherwise, as a 2½-player graph game. There has
been a long history of using 2-player graph games for modeling and
synthesizing reactive processes [1, 19, 21]: a reactive system and its
environment represent the two players, whose states and transitions are
specified by the states and edges of a game graph. Consequently, 2½-player
graph games provide the theoretical foundation for modeling and synthesizing
stochastic reactive processes [20, 13].

For the modeling and synthesis (or “control”) of reactive processes, one
traditionally considers ω-regular winning conditions, which naturally express
the temporal specifications and fairness assumptions of transition systems
[15]. This paper focuses on the complexity of solving 2½-player graph games
with respect to two important normal forms of ω-regular conditions: Rabin
conditions and Streett conditions [23]. Rabin and Streett conditions are dual
(i.e., complementary), and their practical relevance stems from the fact that
their form corresponds to the form of fairness conditions for transition
systems. In particular, no blow-up in the system representation is required
when encoding fairness as a Streett condition, or dually, in the antecedent of
a temporal specification, as a Rabin condition [15].

In the case of 2-player graph games, where no randomization is involved, a
fundamental determinacy result ensures that, given an ω-regular (or indeed
Borel) condition, at each state, either player 1 has a strategy to ensure that
the condition holds, or player 2 has a strategy to ensure that the condition
does not hold [17]. The problem of solving 2-player graph games thus consists
in finding the set of winning states, from which player 1 can ensure that the
condition holds. This problem is known to be in NP ∩ coNP for parity
conditions [11], to be NP-complete for Rabin conditions [12, 11, 23], and
consequently, to be coNP-complete for Streett conditions. The proofs of
inclusion in NP rely on the existence of pure (i.e., deterministic) memoryless
strategies, which act as polynomial witnesses. The existence of pure
memoryless winning strategies is also of independent interest, as such
strategies can be simply and effectively implemented by a controller.

In the case of 2½-player graph games, where randomization is present in the
transition structure, the notion of winning needs to be clarified. Player 1 is
said to win surely if she has a strategy that guarantees that the winning
condition is achieved against all player-2 strategies. While this is the
classical notion of winning in the 2-player case, it is less meaningful in the
presence of probabilistic states, because it makes all probabilistic choices
adversarial (it treats them analogously to player-2 choices). To adequately
treat probabilistic choice, we consider the probability with which player 1
can ensure that the winning condition is met. We thus define two solution
problems for 2½-player graph games: a qualitative one, which asks for the
computation of the set of states from which player 1 can ensure winning with
probability 1, and a quantitative one, which asks for the computation of the
maximal probability with which player 1 can ensure winning from each state
(also called the value of the game at a state) [9, 8]. Correspondingly, we
define almost-sure winning strategies, which enable player 1 to win with
probability 1 whenever possible, and optimal strategies, which enable player 1
to win with maximal probability. The main result of this paper is that, in
2½-player graph games, both the qualitative and the quantitative solution
problems are NP-complete in the case of Rabin conditions, and coNP-complete in
the case of Streett conditions. The NP-hardness for Rabin conditions follows
from NP-hardness of 2-player games with Rabin conditions [12, 23]; we
establish the membership in NP. Both questions have been known to be in
NP ∩ coNP for the more restrictive, self-dual parity conditions [18, 4, 24],
whose exact complexity is an important open problem.

Our proof of membership in NP for stochastic Rabin games relies on
establishing the existence of pure memoryless almost-sure winning and optimal
strategies. The corresponding result for stochastic parity games has been
proved only recently [18, 4, 24]; the proofs rely heavily on the self-duality
of parity conditions. For Rabin conditions, a new proof approach is required.
First, we show the existence of pure memoryless almost-sure winning strategies
in stochastic Rabin games; the proof is based on a reduction from 2½-player
games to 2-player games that preserves the ability of player 1 to win with
probability 1 (but not, obviously, the maximal probability of winning). The
proof technique is different from the techniques for parity games [3], which
rely on the notion of ranking functions and on the self-duality of parity
conditions. The present proof technique is combinatorial and uses
graph-theoretic arguments to take care of the fact that Rabin objectives are
not closed under complementation. Our reduction establishes the membership in
NP of the qualitative solution problem for stochastic Rabin games. To show the
existence of pure memoryless optimal strategies, we partition the game graph
into value classes, each consisting of states where the value of the game is
identical. We show that if the players play according to optimal strategies,
then the game leaves every intermediate value class (in which the value is
neither 0 nor 1) with probability 1. We can then leverage the results on
almost-sure winning to show the existence of pure memoryless optimal
strategies, and establish the membership in NP also for the quantitative
solution problem for stochastic Rabin games. The coNP-completeness of
stochastic Streett games follows by duality.

We emphasize that, as mentioned earlier, the existence of pure memoryless
strategies is relevant in its own right: such strategies are mappings that
associate with each player-1 state a unique successor, without need for
randomization or memory, and such mappings are easily implemented in
controllers. The result that a pure memoryless strategy suffices for winning
with probability 1 and for optimality in every stochastic Rabin game is far
from obvious; recall that Streett conditions in general require memory even in
the simpler case of non-stochastic (i.e., 2-player) graph games. Furthermore,
our techniques lead us to a far more general result, which establishes a
strong connection between the qualitative and quantitative problems: we show
that for any ω-regular objective in a 2½-player game graph, if a family of
strategies suffices for almost-sure winning, it also suffices for optimality.
Hence future research about 2½-player games with ω-regular objectives can
focus on qualitative winning strategies, since our result lifts sufficiency
results for qualitative winning to quantitative winning.

2  Preliminaries 

We consider several classes of turn-based games, namely, two-player turn-based
probabilistic games (2½-player games), two-player turn-based deterministic
games (2-player games), and Markov decision processes (1½-player games).

Probability distribution. For a countable set $A$, a probability distribution
on the set $A$ is a function $\delta\colon A \to [0,1]$ such that
$\sum_{a \in A} \delta(a) = 1$. We denote the set of probability distributions
on the set $A$ by $\mathcal{D}(A)$.

Game graphs. A turn-based probabilistic game graph (2½-player game graph)
$G = ((S, E), (S_1, S_2, S_\bigcirc), \delta)$ consists of a directed graph
$(S, E)$, a partition $(S_1, S_2, S_\bigcirc)$ of the set of states $S$, and a
probabilistic transition function $\delta\colon S_\bigcirc \to \mathcal{D}(S)$,
where $\mathcal{D}(S)$ denotes the set of probability distributions over the
state space $S$. The states in $S_1$ are the player-1 states, where player 1
decides the successor state; the states in $S_2$ are the player-2 states,
where player 2 decides the successor state; and the states in $S_\bigcirc$ are
the probabilistic states, where the successor state is chosen according to the
probabilistic transition function $\delta$. We assume that, for
$s \in S_\bigcirc$ and $t \in S$, we have $(s,t) \in E$ iff
$\delta(s)(t) > 0$, and we often write $\delta(s,t)$ for $\delta(s)(t)$. For
technical convenience we assume that in $(S, E)$ every state has at least one
outgoing edge. For a state $s$ we write $E(s)$ to denote
$\{ t \in S \mid (s,t) \in E \}$, so that $t \in E(s)$ stands for
$(s,t) \in E$. We denote by $n$ the size of the state space, i.e., $n = |S|$,
and by $m$ the number of edges, i.e., $m = |E|$.
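
As an illustration only, the following Python sketch encodes a small 2½-player
game graph with dictionaries and checks the side conditions stated above: the
three state sets partition $S$, every state has an outgoing edge, and at
probabilistic states $(s,t) \in E$ iff $\delta(s)(t) > 0$. The function name
`is_valid_game_graph`, the dictionary encoding, and the example graph are
assumptions made for this sketch and do not come from the paper.

```python
from fractions import Fraction


def is_valid_game_graph(edges, s1, s2, sp, delta):
    """Return True iff the data satisfies the game-graph side conditions.

    edges: dict mapping each state to its set of successors E(s)
    s1, s2, sp: player-1, player-2, and probabilistic states
    delta: dict mapping each probabilistic state to {successor: probability}
    """
    states = set(edges)
    # (S1, S2, S_prob) must partition S.
    if s1 | s2 | sp != states or s1 & s2 or s1 & sp or s2 & sp:
        return False
    for succs in edges.values():
        # Every state has at least one outgoing edge, and edges stay inside S.
        if not succs or not succs <= states:
            return False
    for s in sp:
        dist = delta.get(s, {})
        # delta(s) must be a probability distribution ...
        if any(p < 0 for p in dist.values()) or sum(dist.values()) != 1:
            return False
        # ... whose support is exactly E(s).
        if {t for t, p in dist.items() if p > 0} != edges[s]:
            return False
    return True


# Hypothetical example: "a", "c" belong to player 1, "b" to player 2,
# and "r" is a probabilistic state choosing "a" or "c" uniformly.
EDGES = {"a": {"b", "r"}, "b": {"a"}, "r": {"a", "c"}, "c": {"c"}}
S1, S2, SP = {"a", "c"}, {"b"}, {"r"}
DELTA = {"r": {"a": Fraction(1, 2), "c": Fraction(1, 2)}}
```

Exact rationals (`Fraction`) are used so that the check that probabilities sum
to 1 is not affected by floating-point rounding.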

An infinite path, or play, of the game graph $G$ is an infinite sequence
$\omega = \langle s_0, s_1, s_2, \ldots \rangle$ of states such that
$(s_k, s_{k+1}) \in E$ for all $k \in \mathbb{N}$. We write $\Omega$ for the
set of all plays, and for every state $s \in S$ we write $\Omega_s$ for the
set of plays that start from the state $s$.

A set $U \subseteq S$ of states is called $\delta$-closed if for every
probabilistic state $u \in U \cap S_\bigcirc$, we have that $(u,t) \in E$
implies $t \in U$; it is called $\delta$-live if for every state
$s \in U \cap (S_1 \cup S_2)$ there is a state $t \in U$ such that
$(s,t) \in E$. A $\delta$-closed and $\delta$-live subset $U$ of $S$ induces a
subgame graph of $G$, indicated by $G \upharpoonright U$.
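
The two conditions can be checked directly on a dictionary encoding of the
edge relation. The sketch below is only an illustration under that assumed
encoding (each state maps to its set of successors); the function names and
the example graph are hypothetical.

```python
def is_delta_closed(U, edges, prob_states):
    """Every probabilistic state of U has all of its successors inside U."""
    return all(edges[u] <= U for u in U & prob_states)


def is_delta_live(U, edges, prob_states):
    """Every player-1 or player-2 state of U has some successor inside U."""
    return all(edges[s] & U for s in U - prob_states)


# Hypothetical example graph; "r" is the only probabilistic state.
EDGES = {"a": {"b", "r"}, "b": {"a"}, "r": {"a", "c"}, "c": {"c"}}
PROB = {"r"}
```

For instance, $\{a, b\}$ is both $\delta$-closed and $\delta$-live, $\{a, r\}$
is not $\delta$-closed because the probabilistic state $r$ can move to $c$,
and $\{b\}$ is not $\delta$-live because the only successor of $b$ lies
outside the set.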

The turn-based deterministic game graphs (2-player game graphs) are the
special case of the 2½-player game graphs with $S_\bigcirc = \emptyset$. The
Markov decision processes (1½-player game graphs) are the special case of the
2½-player game graphs with $S_1 = \emptyset$ or $S_2 = \emptyset$. We refer to
the MDPs with $S_2 = \emptyset$ as player-1 MDPs, and to the MDPs with
$S_1 = \emptyset$ as player-2 MDPs. A game graph which is both deterministic
and an MDP is called a transition system (1-player game graph): a player-1
transition system has only player-1 states; a player-2 transition system has
only player-2 states.

Strategies. A strategy for player 1 is a function
$\sigma\colon S^* \cdot S_1 \to \mathcal{D}(S)$ that assigns a probability
distribution to every finite sequence $w \in S^* \cdot S_1$ of states, which
represents the history of the play so far. Player 1 follows the strategy
$\sigma$ if in each move, given that the current history of the play is
$w \in S^* \cdot S_1$, she chooses the next state according to the probability
distribution $\sigma(w)$. A strategy must prescribe only available moves,
i.e., for all $w \in S^*$, $s \in S_1$, and $t \in S$, if
$\sigma(w \cdot s)(t) > 0$, then $(s,t) \in E$. The strategies for player 2
are defined analogously. We denote by $\Sigma$ and $\Pi$ the sets of all
strategies for player 1 and player 2, respectively. Note that for player-1
MDPs the set $\Pi$ is a singleton, i.e., player 2 has only a single trivial
strategy.

Pure strategies. We classify strategies according to their use of
randomization and memory. The strategies that do not use randomization are
called pure. A player-1 strategy $\sigma$ is pure if for all $w \in S^*$ and
$s \in S_1$, there is a state $t \in S$ such that $\sigma(w \cdot s)(t) = 1$.
The pure strategies for player 2 are defined analogously. We denote by
$\Sigma^P$ and $\Pi^P$ the sets of pure strategies for player 1 and player 2,
respectively. A strategy that is not necessarily pure is called randomized.

Finite memory and memoryless strategies. Let $M$ be a set called memory. A
strategy with memory can be described as a pair of functions: (a) a memory
update function $\sigma_u\colon S \times M \to M$, and (b) a next move
function $\sigma_m\colon S_1 \times M \to \mathcal{D}(S)$. A strategy is
finite-memory if the memory $M$ is finite. We denote by $\Sigma^F$ the set of
finite-memory strategies for player 1, and by $\Sigma^{PF}$ the set of pure
finite-memory strategies; that is, $\Sigma^{PF} = \Sigma^P \cap \Sigma^F$. A
strategy is memoryless if $|M| = 1$; hence, the next move does not depend on
the history but only on the current state. A memoryless strategy for player 1
can be represented as a function $\sigma\colon S_1 \to \mathcal{D}(S)$ such
that for all $s \in S_1$ and $t \in S$, if $\sigma(s)(t) > 0$, then
$(s,t) \in E$. A pure memoryless strategy is a pure strategy that is
memoryless. A pure memoryless strategy for player 1 can be represented as a
function $\sigma\colon S_1 \to S$ such that $(s, \sigma(s)) \in E$ for all
$s \in S_1$. We denote by $\Sigma^M$ the set of memoryless strategies for
player 1, and by $\Sigma^{PM}$ the set of pure memoryless strategies; that is,
$\Sigma^{PM} = \Sigma^P \cap \Sigma^M$.

Analogously we define the corresponding strategy families for player 2.

Given a strategy $\sigma \in \Sigma$ for player 1, we write $G_\sigma$ for the
game played on the graph $G$ under the constraint that player 1 follows the
strategy $\sigma$. The corresponding definition for a player-2 strategy is
analogous. Observe that given a 2½-player game graph $G$ and a memoryless
player-1 strategy $\sigma$, the result $G_\sigma$ is a player-2 MDP.
Similarly, for a player-1 MDP $G$ and a memoryless player-1 strategy $\sigma$,
the result $G_\sigma$ is a Markov chain. Hence, if $G$ is a 2½-player game
graph and the two players follow given memoryless strategies $\sigma$ and
$\pi$, the result $G_{\sigma,\pi}$ is a Markov chain. Given a game graph $G$
and a finite-memory strategy $\sigma$ for player 1 with memory $M$, the
strategy $\sigma$ can be interpreted as a memoryless strategy $\sigma_m$ in
the usual synchronous product of the game graph $G$ with the memory $M$, i.e.,
$G \times M$. Analogous observations hold for player-2 strategies $\pi$. These
observations will be useful in the analysis of 2½-player games.
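
To make the observation about $G_{\sigma,\pi}$ concrete, the sketch below
builds the transition probabilities of the Markov chain obtained when both
players fix pure memoryless strategies. It is an illustration only: the
function `induced_markov_chain`, the dictionary encoding of $E$, $\delta$,
$\sigma$ and $\pi$, and the example game are assumptions of this sketch.

```python
from fractions import Fraction


def induced_markov_chain(edges, delta, sigma, pi):
    """Transition probabilities of G_{sigma,pi} for pure memoryless strategies.

    sigma and pi map player-1 and player-2 states to the chosen successor;
    delta maps each probabilistic state to its successor distribution.
    """
    chain = {}
    for s in edges:
        if s in delta:
            # Probabilistic state: the distribution delta(s) is kept as is.
            chain[s] = dict(delta[s])
        else:
            # Player state: the chosen edge is taken with probability 1.
            t = sigma[s] if s in sigma else pi[s]
            if t not in edges[s]:
                raise ValueError(f"strategy picks a non-edge at {s!r}")
            chain[s] = {t: Fraction(1)}
    return chain


# Hypothetical example: "a", "c" are player-1 states, "b" is a player-2
# state, and "r" is probabilistic.
EDGES = {"a": {"b", "r"}, "b": {"a"}, "r": {"a", "c"}, "c": {"c"}}
DELTA = {"r": {"a": Fraction(1, 2), "c": Fraction(1, 2)}}
SIGMA = {"a": "r", "c": "c"}
PI = {"b": "a"}
CHAIN = induced_markov_chain(EDGES, DELTA, SIGMA, PI)
```

Fixing only `SIGMA` and leaving the player-2 choices open would give the
player-2 MDP $G_\sigma$ instead of a Markov chain.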

Once a starting state $s \in S$ and strategies $\sigma \in \Sigma$ and
$\pi \in \Pi$ for the two players are fixed, the outcome of the game is a
random walk $\omega_s^{\sigma,\pi}$ for which the probabilities of events are
uniquely defined, where an event $\mathcal{A} \subseteq \Omega$ is a
measurable set of paths. Given strategies $\sigma$ for player 1 and $\pi$ for
player 2, a play $\omega = \langle s_0, s_1, s_2, \ldots \rangle$ is feasible
in a 2½-player game graph if for every $k \in \mathbb{N}$ the following three
conditions hold: (1) if $s_k \in S_\bigcirc$, then $(s_k, s_{k+1}) \in E$;
(2) if $s_k \in S_1$, then $\sigma(s_0, s_1, \ldots, s_k)(s_{k+1}) > 0$; and
(3) if $s_k \in S_2$, then $\pi(s_0, s_1, \ldots, s_k)(s_{k+1}) > 0$. Given
strategies $\sigma \in \Sigma$ and $\pi \in \Pi$, and a state $s$, we denote
by $\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Omega_s$ the set of feasible
plays that start from $s$ given strategies $\sigma$ and $\pi$. For a state
$s \in S$ and an event $\mathcal{A} \subseteq \Omega$, we write
$\Pr_s^{\sigma,\pi}(\mathcal{A})$ for the probability that a path belongs to
$\mathcal{A}$ if the game starts from the state $s$ and the players follow the
strategies $\sigma$ and $\pi$, respectively. In the context of player-1 MDPs
we often omit the argument $\pi$, because $\Pi$ is a singleton set.

Objectives. An objective for a player consists of an ω-regular set of winning
plays $\Phi \subseteq \Omega$ [22]. In this paper we study only zero-sum games
[20, 13], where the objectives of the two players are complementary. In other
words, it is implicit that if the objective of one player is $\Phi$, then the
objective of the other player is $\Omega \setminus \Phi$. Given a game graph
$G$ and an objective $\Phi \subseteq \Omega$, we write $(G, \Phi)$ for the
game played on the graph $G$ with the objective $\Phi$ for player 1.

In this paper we consider ω-regular objectives specified as Rabin and Streett
objectives. For a play
$\omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega$, we define
$\mathrm{Inf}(\omega) = \{ s \in S \mid s_k = s \text{ for infinitely many } k \geq 0 \}$
to be the set of states that occur infinitely often in $\omega$. We use colors
to define objectives independent of game graphs. For a set $C$ of colors, we
write $[\cdot]\colon C \to 2^S$ for a function that maps each color to a set
of states. Inversely, given a set $U \subseteq S$ of states, we write
$[U] = \{ c \in C \mid [c] \cap U \neq \emptyset \}$ for the set of colors
that occur in $U$. Note that a state can have multiple colors.

1. Reachability and safety objectives. Given a color $c$, the reachability
   objective requires that some state of color $c$ be visited. Let $T = [c]$
   be the set of so-called target states. Formally, we write
   $\mathrm{Reach}(T) = \{ \omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega \mid s_k \in T \text{ for some } k \geq 0 \}$
   for the set of winning plays. Given $c$, the safety objective requires that
   only states of color $c$ be visited. Let $F = [c]$ be the set of so-called
   safe states. Formally, the set of winning plays is
   $\mathrm{Safe}(F) = \{ \omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega \mid s_k \in F \text{ for all } k \geq 0 \}$.

2. Rabin, parity, and Streett objectives. Given a set
   $P = \{(e_1, f_1), \ldots, (e_d, f_d)\}$ of pairs of colors, the Rabin
   objective requires that for some $1 \leq i \leq d$, all states of color
   $e_i$ be visited finitely often and some state of color $f_i$ be visited
   infinitely often. Let $P = \{(E_1, F_1), \ldots, (E_d, F_d)\}$ be the
   corresponding set of so-called Rabin pairs, where $E_i = [e_i]$ and
   $F_i = [f_i]$ for all $1 \leq i \leq d$. Formally, the set of winning plays
   is
   $\mathrm{Rabin}(P) = \{ \omega \in \Omega \mid \exists 1 \leq i \leq d.\ (\mathrm{Inf}(\omega) \cap E_i = \emptyset \land \mathrm{Inf}(\omega) \cap F_i \neq \emptyset) \}$.
   Without loss of generality, we require that
   $\bigl(\bigcup_{i \in \{1, 2, \ldots, d\}} (E_i \cup F_i)\bigr) = S$. The
   parity (or Rabin-chain) objectives are the special case of Rabin objectives
   where $E_1 \subset F_1 \subset E_2 \subset F_2 \cdots \subset E_d \subset F_d$.
   The Rabin-chain objective can be represented as a parity objective defined
   as follows: define a priority function $p$ that labels each state in
   $E_i \setminus F_{i-1}$ by a priority $2i - 1$ and each state in
   $F_i \setminus E_i$ by a priority $2i$. The parity objective requires that
   the minimum priority of the states visited infinitely often is even.
   Formally, the set of winning plays is
   $\mathrm{Parity}(p) = \{ \omega \in \Omega \mid \min(p(\mathrm{Inf}(\omega))) \text{ is even} \}$.
   Given $P$, the Streett objective requires that for each
   $1 \leq i \leq d$, if some state of color $f_i$ is visited infinitely
   often, then some state of color $e_i$ is visited infinitely often.
   Formally, for the set $P = \{(E_1, F_1), \ldots, (E_d, F_d)\}$ of so-called
   Streett pairs, the set of winning plays is
   $\mathrm{Streett}(P) = \{ \omega \in \Omega \mid \forall 1 \leq i \leq d.\ (\mathrm{Inf}(\omega) \cap E_i \neq \emptyset \lor \mathrm{Inf}(\omega) \cap F_i = \emptyset) \}$.
   Note that the Rabin and Streett objectives are dual. Moreover, every parity
   objective is both a Rabin objective and a Streett objective. Hence, parity
   objectives are closed under complementation.
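
Since all three objectives depend on a play only through
$\mathrm{Inf}(\omega)$, they can be evaluated on a finite set of states. The
following sketch is an illustration under that view; the function names, the
encoding of pairs as Python sets, and the four-state Rabin-chain example (with
$F_0 = \emptyset$) are assumptions of this sketch, not notation from the
paper.

```python
from itertools import combinations


def rabin(inf, pairs):
    """Some pair (E, F) has Inf disjoint from E and intersecting F."""
    return any(not (inf & e) and bool(inf & f) for e, f in pairs)


def streett(inf, pairs):
    """Every pair (E, F) has Inf intersecting E or disjoint from F."""
    return all(bool(inf & e) or not (inf & f) for e, f in pairs)


def parity(inf, priority):
    """The minimum priority among the states in Inf is even."""
    return min(priority[s] for s in inf) % 2 == 0


# Hypothetical Rabin-chain example: E1 < F1 < E2 < F2 = S (strict inclusions).
STATES = ["a", "b", "c", "d"]
CHAIN_PAIRS = [({"a"}, {"a", "b"}), ({"a", "b", "c"}, {"a", "b", "c", "d"})]
# States in E_i \ F_{i-1} get priority 2i - 1, states in F_i \ E_i get 2i.
PRIORITY = {"a": 1, "b": 2, "c": 3, "d": 4}
NONEMPTY_SUBSETS = [
    set(c) for r in range(1, len(STATES) + 1) for c in combinations(STATES, r)
]
```

On this example one can check by enumeration over `NONEMPTY_SUBSETS` that
`rabin` and `streett` always disagree on the same pairs (duality), and that
`rabin` on the chain pairs agrees with `parity` on the derived priorities.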

We commonly use terminology like the following: a 2½-player Rabin game
$(G, \mathrm{Rabin}(P))$ consists of a 2½-player game graph $G$ and a Rabin
objective for player 1.

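Values of a game. Given ω-regular objectives $\Phi \subseteq \Omega$ for
player 1 and $\Omega \setminus \Phi$ for player 2, we define the value
functions $\langle\langle 1 \rangle\rangle_{val}$ and
$\langle\langle 2 \rangle\rangle_{val}$ for the players 1 and 2, respectively,
as follows:

$$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \Pr\nolimits_s^{\sigma,\pi}(\Phi); \qquad \langle\langle 2 \rangle\rangle_{val}(\Omega \setminus \Phi)(s) = \sup_{\pi \in \Pi} \inf_{\sigma \in \Sigma} \Pr\nolimits_s^{\sigma,\pi}(\Omega \setminus \Phi).$$

A strategy $\sigma$ for player 1 is optimal from state $s$ for objective
$\Phi$ if

$$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \inf_{\pi \in \Pi} \Pr\nolimits_s^{\sigma,\pi}(\Phi).$$

The optimal strategies for player 2 are defined analogously. The quantitative
determinacy of 2½-player games with Rabin objectives follows from the result
of Martin [16].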

Theorem 1 (Quantitative determinacy [16]) For all 2½-player game graphs, all
Rabin objectives $\Phi$ and all states $s$,

$$\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) + \langle\langle 2 \rangle\rangle_{val}(\Omega \setminus \Phi)(s) = 1.$$

Sure, almost-sure and limit-sure winning strategies. Given an objective
$\Phi$, a strategy $\sigma$ is a sure winning strategy for player 1 from a
state $s$ if for every strategy $\pi$ of player 2 we have
$\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Phi$. A strategy $\sigma$ is an
almost-sure winning strategy for player 1 from a state $s$ for the objective
$\Phi$ if for every strategy $\pi$ of player 2 we have
$\Pr_s^{\sigma,\pi}(\Phi) = 1$. A family $\Sigma^C$ of strategies is
limit-sure winning for player 1 from a state $s$ if
$\sup_{\sigma \in \Sigma^C} \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi) = 1$.
The sure, almost-sure and limit-sure winning strategies for player 2 are
defined analogously. Given an objective $\Phi$, the sure winning set
$\langle\langle 1 \rangle\rangle_{sure}(\Phi)$ for player 1 is the set of
states from which player 1 has a sure winning strategy. The almost-sure
winning set $\langle\langle 1 \rangle\rangle_{almost}(\Phi)$ for player 1 is
the set of states from which player 1 has an almost-sure winning strategy. The
limit-sure winning set $\langle\langle 1 \rangle\rangle_{limit}(\Phi)$ for
player 1 is the set of states from which player 1 has limit-sure winning
strategies. The sure winning set
$\langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi)$, the
almost-sure winning set
$\langle\langle 2 \rangle\rangle_{almost}(\Omega \setminus \Phi)$, and the
limit-sure winning set
$\langle\langle 2 \rangle\rangle_{limit}(\Omega \setminus \Phi)$ for player 2
are defined analogously. It follows from the definitions that for all
2½-player game graphs and all objectives $\Phi$ we have
$\langle\langle 1 \rangle\rangle_{sure}(\Phi) \subseteq \langle\langle 1 \rangle\rangle_{almost}(\Phi) \subseteq \langle\langle 1 \rangle\rangle_{limit}(\Phi)$
and
$\langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi) \subseteq \langle\langle 2 \rangle\rangle_{almost}(\Omega \setminus \Phi) \subseteq \langle\langle 2 \rangle\rangle_{limit}(\Omega \setminus \Phi)$.
Computing sure winning, almost-sure winning and limit-sure winning sets and
strategies is referred to as the qualitative analysis of 2½-player games [8].
The following result is the classical determinacy result for 2-player
deterministic games.

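Theorem 2 (Qualitative determinacy [17]) For all 2-player game graphs and all
Rabin objectives $\Phi$ we have

$$\langle\langle 1 \rangle\rangle_{sure}(\Phi) \cap \langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi) = \emptyset; \qquad \langle\langle 1 \rangle\rangle_{almost}(\Phi) = \langle\langle 1 \rangle\rangle_{sure}(\Phi);$$

$$\langle\langle 1 \rangle\rangle_{sure}(\Phi) \cup \langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi) = S; \qquad \langle\langle 2 \rangle\rangle_{almost}(\Omega \setminus \Phi) = \langle\langle 2 \rangle\rangle_{sure}(\Omega \setminus \Phi).$$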

Sufficiency of a family of strategies. Let $C \in \{P, M, F, PM, PF\}$ and
consider the family $\Sigma^C$ of special strategies for player 1. We say that
the family $\Sigma^C$ suffices with respect to an objective $\Phi$ on a class
$\mathcal{G}$ of game graphs for

•  sure winning if for every game graph $G \in \mathcal{G}$, for every
   $s \in \langle\langle 1 \rangle\rangle_{sure}(\Phi)$ there is a player-1
   strategy $\sigma \in \Sigma^C$ such that for every player-2 strategy
   $\pi \in \Pi$ we have $\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Phi$;

•  almost-sure winning if for every game graph $G \in \mathcal{G}$, for every
   state $s \in \langle\langle 1 \rangle\rangle_{almost}(\Phi)$ there is a
   player-1 strategy $\sigma \in \Sigma^C$ such that for every player-2
   strategy $\pi \in \Pi$ we have $\Pr_s^{\sigma,\pi}(\Phi) = 1$;

•  limit-sure winning if for every game graph $G \in \mathcal{G}$, for every
   state $s \in \langle\langle 1 \rangle\rangle_{limit}(\Phi)$ we have
   $\sup_{\sigma \in \Sigma^C} \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi) = 1$;

•  optimality if for every game graph $G \in \mathcal{G}$, for every state
   $s \in S$ there is a player-1 strategy $\sigma \in \Sigma^C$ such that
   $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi)$.

For sure winning, the 1½-player and 2½-player games coincide with 2-player
deterministic games where the random player (who chooses the successor at the
probabilistic states) is interpreted as an adversary, i.e., as player 2. This
is formalized by the proposition below.

Proposition 1 If a family $\Sigma^C$ of strategies suffices for sure winning
with respect to an ω-regular objective $\Phi$ on all 2-player game graphs,
then the family $\Sigma^C$ suffices for sure winning with respect to $\Phi$
also on all 1½-player and 2½-player game graphs.

The following theorem states the classical results on memoryless and
finite-memory determinacy for 2-player deterministic graph games with Rabin
and Streett objectives.

Theorem 3 (Pure memoryless and finite-memory strategies)

1. [12, 10] The family of pure memoryless strategies suffices for sure
   winning with respect to all Rabin objectives on 2-player game graphs.

2. [14] The family of pure finite-memory strategies suffices for sure
   winning with respect to all Streett objectives on 2-player game graphs.


3  MDPs,  End  Components,  and  Streett  objectives 

In this section we develop some facts on end components [7] that are needed
for the further developments of the paper. We consider player-1 MDPs and
hence strategies for player 1. Let
$G = ((S, E), (S_1, S_2, S_\bigcirc), \delta)$ with $S_2 = \emptyset$ be a
1½-player game graph.

Definition 1 (End component) A set $U \subseteq S$ of states is an
end-component if $U$ is $\delta$-closed and the subgame graph
$G \upharpoonright U$ is strongly connected.
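
The definition can be tested mechanically on a small MDP. The sketch below is
illustrative only and rests on assumptions of its own: edges are stored as a
dictionary from states to successor sets, the function name
`is_end_component` is hypothetical, and "induces a subgame graph" is read as
requiring every state of $U$ to have a successor inside $U$, so that a single
state without a self-loop is rejected.

```python
def is_end_component(U, edges, prob_states):
    """Check that U is delta-closed and that G restricted to U is strongly connected."""
    U = set(U)
    if not U:
        return False
    # delta-closed: probabilistic states cannot leave U.
    if any(not edges[u] <= U for u in U & prob_states):
        return False
    # Every state needs a successor in U for the subgame graph to exist.
    if any(not (edges[u] & U) for u in U):
        return False

    def reach(start, successors):
        """States of U reachable from start using only edges inside U."""
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in successors(x):
                if y in U and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    root = next(iter(U))
    forward = reach(root, lambda x: edges[x])
    backward = reach(root, lambda x: {y for y in U if x in edges[y]})
    # Strongly connected iff the root reaches, and is reached from, all of U.
    return forward == U and backward == U


# Hypothetical player-1 MDP: "r" is probabilistic, all other states are player 1.
EDGES = {"a": {"b", "r"}, "b": {"a"}, "r": {"a", "c"}, "c": {"c"}}
PROB = {"r"}
```

In this example $\{a, b\}$ and $\{c\}$ are end-components, while
$\{a, b, r\}$ is not, because the probabilistic state $r$ can leave the set.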

We denote by $\mathcal{E} \subseteq 2^S$ the set of all end-components of $G$.
The next lemma states that, under any strategy (memoryless or not), with
probability 1 the set of states visited infinitely often along a play is an
end-component. This lemma allows us to derive conclusions on the (infinite)
set of plays in an MDP by analyzing the (finite) set of end components in the
MDP. In particular, the lemma implies that to show that a Streett (resp.
Rabin) condition $\{(e_1, f_1), \ldots, (e_d, f_d)\}$ is satisfied with
probability 1, it suffices to show that for all reachable end components $U$,
we have that
$\forall i \in [1..d].\ (U \cap E_i \neq \emptyset \lor U \cap F_i = \emptyset)$
(resp., for Rabin conditions,
$\exists i \in [1..d].\ (U \cap E_i = \emptyset \land U \cap F_i \neq \emptyset)$).
To state the lemma, for $s \in S$ and $U \subseteq S$, we define
$\Omega_s^U = \{ \omega \in \Omega_s \mid \mathrm{Inf}(\omega) = U \}$.

Lemma 1 [7] For all states $s \in S$ and strategies $\sigma \in \Sigma$, we
have $\Pr_s^{\sigma}\bigl(\bigcup_{U \in \mathcal{E}} \Omega_s^U\bigr) = 1$.

Next, we present a polynomial-time algorithm for computing the maximal
probability of satisfying a Streett condition in an MDP; the algorithm will be
used in later sections to argue that certain witnesses can be checked in
polynomial time. Consider a set $P = \{(E_1, F_1), \ldots, (E_d, F_d)\}$ of
Streett pairs. Let $U \in \mathcal{H}$ iff $U \in \mathcal{E}$ and for all
$1 \leq i \leq d$, we have either $U \cap E_i \neq \emptyset$ or
$U \cap F_i = \emptyset$. The following lemma states that the maximal
probability of satisfying $\mathrm{Streett}(P)$ is equal to the maximal
probability of reaching $T_{end} = \bigcup_{U \in \mathcal{H}} U$.

Lemma 2 [2] $\langle\langle 1 \rangle\rangle_{val}(\mathrm{Streett}(P)) = \langle\langle 1 \rangle\rangle_{val}(\mathrm{Reach}(T_{end}))$.

We present a polynomial-time algorithm for computing $T_{end}$; the
computation of the value then reduces to computing values of an MDP with a
reachability objective, which can be achieved by linear programming [6]. To
state the algorithm, we say that an end-component $U \subseteq S$ is maximal
in $V \subseteq S$ if $U \subseteq V$, and if there is no end-component $U'$
with $U \subset U' \subseteq V$. Given a set $V \subseteq S$, we denote by
$\mathrm{MaxEC}(V)$ the set consisting of all maximal end components $U$ such
that $U \subseteq V$. This set can be computed in quadratic time with standard
graph algorithms; see, e.g., [7]. The set $T_{end}$ can be computed with the
following algorithm.

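    $L := \mathrm{MaxEC}(S)$; $D := \emptyset$
    while $L \neq \emptyset$ do
        choose $U \in L$ and let $L := L \setminus \{ U \}$
        if $\forall i \in [1..d].\ (U \cap E_i \neq \emptyset \lor U \cap F_i = \emptyset)$
            then $D := D \cup \{ U \}$
            else choose $i \in [1..d]$ such that $U \cap E_i = \emptyset$ and $U \cap F_i \neq \emptyset$, and
                 let $L := L \cup \mathrm{MaxEC}(U \setminus F_i)$
        end if
    end while
    Return: $T_{end} = \bigcup_{U \in D} U$.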
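
The loop above can be prototyped in a few lines. The following Python sketch
is an illustration, not the authors' implementation: `max_end_components`
uses a simple refinement (repeatedly split into strongly connected components
and discard states that violate closure or liveness) instead of the
quadratic-time procedure cited in the text, and all function names, the
dictionary encoding, and the example MDP are assumptions of this sketch.

```python
def reachable(start, nodes, edges):
    """States of `nodes` reachable from `start` inside the induced subgraph."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in edges[v] & nodes:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def sccs(nodes, edges):
    """Strongly connected components of the induced subgraph (simple, not linear time)."""
    left, comps = set(nodes), []
    while left:
        v = next(iter(left))
        comp = {w for w in reachable(v, nodes, edges) if v in reachable(w, nodes, edges)}
        comps.append(comp)
        left -= comp
    return comps


def max_end_components(V, edges, prob_states):
    """MaxEC(V): the maximal end components contained in V."""
    result, work = [], [set(V)]
    while work:
        cand = work.pop()
        if not cand:
            continue
        # Probabilistic states that can leave cand, or states stuck in cand.
        bad = {u for u in cand
               if (u in prob_states and not edges[u] <= cand) or not (edges[u] & cand)}
        if bad:
            work.append(cand - bad)
            continue
        comps = sccs(cand, edges)
        if len(comps) == 1:
            result.append(cand)      # closed, live, strongly connected
        else:
            work.extend(comps)       # refine each component separately
    return result


def t_end(states, edges, prob_states, pairs):
    """Union of the end components that satisfy every Streett pair (E_i, F_i)."""
    L, D = max_end_components(states, edges, prob_states), []
    while L:
        U = L.pop()
        violated = [F for E, F in pairs if not (U & E) and (U & F)]
        if not violated:
            D.append(U)
        else:
            # Remove the F_i states of one violated pair and recompute.
            L.extend(max_end_components(U - violated[0], edges, prob_states))
    return set().union(*D)


# Hypothetical player-1 MDP: "r" is probabilistic, the rest are player-1 states.
EDGES = {"s": {"f", "r"}, "f": {"s"}, "r": {"g", "s"}, "g": {"g"}}
PROB = {"r"}
```

With the single Streett pair $(\emptyset, \{f\})$, the end component
$\{s, f\}$ is rejected and only $\{g\}$ remains, so the sketch returns
$\{g\}$; with the pair $(\{s\}, \{f\})$ the component $\{s, f\}$ is accepted
as well. Only the edge relation is needed here, since end components do not
depend on the exact probabilities in $\delta$.
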
It is easy to see that every state $s \in S$ is considered as part of an end
component in the else-part of the above algorithm at most once for every
$1 \le i \le d$; hence, the algorithm runs in time polynomial in $|G|$ and $|P|$.
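The algorithm above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the MDP encoding (`succ` maps each state to its set of successors, `prob` is the set of probabilistic states), the helper names `sccs`, `max_ecs` and `t_end`, and the simple prune-and-decompose routine used for MaxEC are all choices made here for illustration.

```python
def sccs(nodes, succ):
    """Strongly connected components of the graph restricted to `nodes` (Tarjan)."""
    index, low, stack, on_stack, out = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in nodes:
                continue  # ignore edges leaving the restricted graph
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            out.append(frozenset(comp))

    for v in nodes:
        if v not in index:
            visit(v)
    return out


def max_ecs(V, succ, prob):
    """Maximal end components contained in V (hypothetical helper for MaxEC)."""
    result, work = [], [set(V)]
    while work:
        C = work.pop()
        while True:
            # A state cannot stay if it has no successor in C, or if it is
            # probabilistic and some successor lies outside C.
            bad = {s for s in C
                   if not (succ[s] & C) or (s in prob and not succ[s] <= C)}
            if not bad:
                break
            C -= bad
        if not C:
            continue
        comps = sccs(C, succ)
        if len(comps) == 1:
            result.append(frozenset(C))
        else:
            work.extend(set(c) for c in comps)
    return result


def t_end(S, succ, prob, pairs):
    """Union of the end components satisfying every Streett pair (E_i, F_i)."""
    L, D = max_ecs(S, succ, prob), []
    while L:
        U = L.pop()
        # A violated pair has U ∩ E_i = ∅ and U ∩ F_i ≠ ∅.
        bad = next(((E, F) for E, F in pairs if not (U & E) and (U & F)), None)
        if bad is None:
            D.append(U)
        else:
            L.extend(max_ecs(U - bad[1], succ, prob))
    return set().union(*D)


# Toy MDP: states 0 and 1 belong to the player, state 2 is a probabilistic sink.
succ = {0: {0, 1}, 1: {0, 2}, 2: {2}}
prob = {2}
print(t_end({0, 1, 2}, succ, prob, [(set(), {1})]))
```

In the toy example the end component {0, 1} violates the single pair, so the else-branch removes state 1 and finds the smaller end component {0}.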

4  Almost-sure  winning  strategies  in  Rabin  games 

In this section we show that pure memoryless strategies suffice for
almost-sure winning with respect to Rabin objectives on 2½-player game graphs.
The result is achieved by a reduction to 2-player Rabin games. This also
gives a direct proof of the fact that the limit-sure and almost-sure winning
sets coincide in 2½-player games with Rabin objectives. Since any ω-regular
objective can be expressed as a Rabin objective, the result holds for all
ω-regular objectives in 2½-player games. Moreover, the reduction allows us
to apply the algorithms for 2-player Rabin games to the qualitative analysis
of 2½-player games with Rabin objectives. In the next section, we use
the existence of pure memoryless almost-sure winning strategies to prove the
existence of pure memoryless optimal strategies.

4.1  Reduction 

Given a 2½-player Rabin game $G = ((S, E), (S_1, S_2, S_\bigcirc), \delta)$
with a coloring $[\cdot]$ that assigns to every state of $S$ a non-empty set
of colors, where $P = \{(e_1, f_1), (e_2, f_2), \ldots, (e_d, f_d)\}$ is a set
of $d$ pairs of colors, we construct a 2-player Rabin game
$\overline{G} = ((\overline{S}, \overline{E}), (\overline{S}_1, \overline{S}_2))$
with a coloring $[\cdot]$ that assigns to every state of $\overline{S}$ a
non-empty set of colors. The construction is described as follows: for every
state $s \in S_1 \cup S_2$, there is a state $\overline{s} \in \overline{S}$
with "the same" outgoing edges, i.e., $(\overline{s}, \overline{t}) \in \overline{E}$
if and only if $(s, t) \in E$. Each probabilistic state $s \in S_\bigcirc$ is
substituted by the gadget presented in Figure 1. More formally, the players
play the following 3-step game in $\overline{G}$ from a probabilistic state
$\overline{s}$. For the state $\overline{s}$ we have $[\overline{s}] = [s]$.
First, in vertex $\overline{s}$ player 2 chooses a successor
$(\widetilde{s}, 2k)$, for $k \in \{0, 1, 2, \ldots, d\}$. For every state
$(\widetilde{s}, 2k)$ we have $[(\widetilde{s}, 2k)] = [s]$. For $k \ge 1$, in
state $(\widetilde{s}, 2k)$ player 1 chooses from two successors: state
$(\widehat{s}, 2k-1)$ with $[(\widehat{s}, 2k-1)] = e_k$, and state
$(\widehat{s}, 2k)$ with $[(\widehat{s}, 2k)] = f_k$. In state
$(\widetilde{s}, 0)$ there is only one successor $(\widehat{s}, 0)$, with
$[(\widehat{s}, 0)] = \{f_1, f_2, \ldots, f_d\}$. Finally, in a state
$(\widehat{s}, k)$ the choice is between all states $\overline{t}$ such that
$(s, t) \in E$, and it belongs to player 1 if $k$ is odd, and to player 2 if
$k$ is even.
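The gadget construction can be made concrete with a short sketch. The encoding is an assumption made here for illustration: `owner[s]` is `1`, `2`, or `"p"` for a probabilistic state, colors are tagged tuples such as `("e", k)` and `("f", k)`, and the copies of a state are named `("bar", s)`, `("tilde", s, 2k)` and `("hat", s, j)`. The text does not say who owns the state with index 0 in the middle layer (it has a single successor), so the sketch assigns it to player 1 arbitrarily.

```python
def reduce_to_two_player(succ, owner, color, d):
    """Build the 2-player game; owner[s] is 1, 2, or "p" (probabilistic)."""
    succ2, owner2, color2 = {}, {}, {}

    def add(v, who, cols, targets):
        succ2[v], owner2[v], color2[v] = set(targets), who, frozenset(cols)

    for s, ts in succ.items():
        bar_succ = [("bar", t) for t in ts]
        if owner[s] != "p":
            add(("bar", s), owner[s], color[s], bar_succ)  # same outgoing edges
            continue
        # Step 1: player 2 picks k in {0, ..., d}.
        add(("bar", s), 2, color[s], [("tilde", s, 2 * k) for k in range(d + 1)])
        # k = 0: a single successor carrying all colors f_1, ..., f_d.
        add(("tilde", s, 0), 1, color[s], [("hat", s, 0)])  # owner is immaterial
        add(("hat", s, 0), 2, {("f", i) for i in range(1, d + 1)}, bar_succ)
        for k in range(1, d + 1):
            # Step 2: player 1 picks the odd copy (color e_k) or the even copy (f_k).
            add(("tilde", s, 2 * k), 1, color[s],
                [("hat", s, 2 * k - 1), ("hat", s, 2 * k)])
            # Step 3: player 1 moves from odd copies, player 2 from even ones.
            add(("hat", s, 2 * k - 1), 1, {("e", k)}, bar_succ)
            add(("hat", s, 2 * k), 2, {("f", k)}, bar_succ)
    return succ2, owner2, color2


# Toy game with one probabilistic state "p" and d = 2 Rabin pairs.
succ = {"p": {"a", "b"}, "a": {"p"}, "b": {"b"}}
owner = {"p": "p", "a": 1, "b": 2}
color = {"p": {("f", 1)}, "a": {("e", 1)}, "b": {("e", 2)}}
succ2, owner2, color2 = reduce_to_two_player(succ, owner, color, 2)
print(len(succ2))
```

Each probabilistic state is replaced by 3d + 3 states, so the reduction is linear in the size of the game for a fixed number of pairs.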

Let $\overline{U}_1$ and $\overline{U}_2$ be the sure-winning sets for players 1
and 2, respectively, in the 2-player Rabin game $\overline{G}$. Define sets
$U_1$ and $U_2$ of states in the 2½-player Rabin game $G$ by
$U_1 = \{ s \in S \mid \overline{s} \in \overline{U}_1 \}$ and
$U_2 = \{ s \in S \mid \overline{s} \in \overline{U}_2 \}$.
By the determinacy of 2-player Rabin games [11] (Theorem 3) we have that
$\overline{U}_1 \cup \overline{U}_2 = \overline{S}$, and hence $U_1 \cup U_2 = S$.

Figure 1: The gadget for the reduction of a 2½-player Rabin game to a
2-player Rabin game.


Definition 2 (Winning strongly connected components and end components)

Let $G$ be a 1-player game graph with a $\mathrm{Rabin}(P)$ objective for
player 1, where $P = \{(e_1, f_1), (e_2, f_2), \ldots, (e_d, f_d)\}$ is a set of
$d$ pairs of colors. A strongly connected component (s.c.c.) $C$ in $G$ is
winning for player 1 if there is an $i \in \{1, 2, \ldots, d\}$ such that
$C \cap F_i \neq \emptyset$ and $C \cap E_i = \emptyset$; otherwise $C$ is
winning for player 2. If $G$ is an MDP with the set $P$ of pairs of colors,
then an end component $C$ in $G$ is winning for player 1 if there is an
$i \in \{1, 2, \ldots, d\}$ such that $C \cap F_i \neq \emptyset$ and
$C \cap E_i = \emptyset$; otherwise $C$ is winning for player 2. ■
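The winning test of Definition 2 is a direct set computation. The following minimal sketch assumes that the Rabin pairs are given as pairs of Python sets of states `(E_i, F_i)`; the function name `rabin_winner` is hypothetical.

```python
def rabin_winner(C, pairs):
    """Return 1 if some pair (E_i, F_i) has C ∩ F_i ≠ ∅ and C ∩ E_i = ∅, else 2."""
    C = set(C)
    for E, F in pairs:
        if C & F and not (C & E):
            return 1  # pair i witnesses that C is winning for player 1
    return 2  # no witness pair: C is winning for player 2


pairs = [({"a"}, {"b"}), ({"c"}, {"d"})]
print(rabin_winner({"b", "x"}, pairs))
```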

Lemma 3 Let $G$ be a 2½-player game graph, let
$P = \{(e_1, f_1), (e_2, f_2), \ldots, (e_d, f_d)\}$ be a set of pairs of colors,
and let $(E_1, F_1), \ldots, (E_d, F_d)$ be the corresponding pairs of sets of
states. There exists a pure memoryless strategy $\sigma$ for player 1 in the
game $G$ such that for all strategies $\pi$ for player 2 we have
$\Pr_s^{\sigma,\pi}(\mathrm{Rabin}(P)) = 1$, for all states $s \in U_1$. Hence
$U_1 \subseteq \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\mathrm{Rabin}(P))$.

Proof. We define a pure memoryless strategy $\sigma$ for player 1 in the game
$G$ from a strategy $\overline{\sigma}$ in the game $\overline{G}$ as follows:
for all states $s \in S_1$, if $\overline{\sigma}(\overline{s}) = \overline{t}$
then set $\sigma(s) = t$. Consider a pure memoryless sure winning strategy
$\overline{\sigma}$ in the game $\overline{G}$ from every state
$\overline{s} \in \overline{U}_1$. Our goal is to establish that $\sigma$ is an
almost-sure winning strategy from every state in $U_1$.

We prove that every end component in the player-2 MDP
$(G \upharpoonright U_1)_\sigma$ is winning for player 1. It would follow from
Lemma 1 that $\sigma$ is an almost-sure winning strategy.

Figure 2: The strategy sub-graph in $\overline{G}_{\overline{\sigma}}$.

We argue that if there is an end component in $(G \upharpoonright U_1)_\sigma$
that is winning for player 2, then we can construct an s.c.c. in the subgraph
$(\overline{G} \upharpoonright \overline{U}_1)_{\overline{\sigma}}$ that is
winning for player 2, which is impossible because $\overline{\sigma}$ is a sure
winning strategy for player 1 from the set $\overline{U}_1$ in the 2-player
Rabin game $\overline{G}$. Let $C$ be an end component in
$(G \upharpoonright U_1)_\sigma$ that is winning for player 2. We denote by
$\overline{C}$ the set of states in the gadgets of the states in $C$. Hence for
all $i \in \{1, 2, \ldots, d\}$ we have: if $F_i \cap C \neq \emptyset$ then
$C \cap E_i \neq \emptyset$. Let us define the set $I = \{i_1, i_2, \ldots, i_j\}$
of indices $i$ such that $E_i \cap C \neq \emptyset$. Thus for all
$i \in (\{1, 2, \ldots, d\} \setminus I)$ we have $F_i \cap C = \emptyset$.
Note that $I \neq \emptyset$, as every state has at least one color.
We now construct a sub-game in $\overline{G}_{\overline{\sigma}}$ as follows:

•  For a state $\overline{s}$ with $s \in C \cap S_2$, keep all the edges
   $(\overline{s}, \overline{t})$ such that $t \in C$.

•  For a state $\overline{s}$ with $s \in C \cap S_\bigcirc$, the sub-game is
   defined as follows:

   -  At state $\overline{s}$ choose the edges to the states
      $(\widetilde{s}, 2i)$ such that $i \in I$.

   -  For a state $(\widehat{s}, 2i)$, player 2 chooses a successor that
      shortens the distance to the vertex set $C \cap E_i$ in the game $G$.

The  construction  is  illustrated  in  Fig.  2. 

We now prove that every terminal s.c.c. is winning for player 2 in the
subgame thus constructed in
$(\overline{G} \upharpoonright \overline{C})_{\overline{\sigma}}$, where
$\overline{C}$ is the set of states in the gadgets of the states in $C$.
Consider an arbitrary terminal s.c.c. $Y$ in the subgame constructed in
$(\overline{G} \upharpoonright \overline{C})_{\overline{\sigma}}$. It follows
from the construction that for every $i \in \{1, 2, \ldots, d\} \setminus I$ we
have $F_i \cap Y = \emptyset$. Suppose that for some $i \in I$ we have
$F_i \cap Y \neq \emptyset$; we show that $E_i \cap Y \neq \emptyset$. There
are two cases:

1.  If there is at least one state $(\widetilde{s}, 2i)$ such that the strategy
    $\overline{\sigma}$ chooses the successor $(\widehat{s}, 2i-1)$, then
    $E_i \cap Y \neq \emptyset$ since $[(\widehat{s}, 2i-1)] = e_i$.

2.  If for every state $(\widetilde{s}, 2i)$ the strategy for player 1 chooses
    the successor $(\widehat{s}, 2i)$, then since $(\widehat{s}, 2i)$ is a
    player-2 state, player 2 chooses a successor to shorten the distance to the
    vertex set $E_i$, and hence the terminal s.c.c. $Y$ must contain a state
    $s$ such that $[s] = e_i$. Hence $E_i \cap Y \neq \emptyset$.

Now we argue that for every probabilistic state $s \in S_\bigcirc \cap U_1$,
all of its successors are in $U_1$. Otherwise, player 2 in the state
$\overline{s}$ of the game $\overline{G}$ can choose the successor
$(\widetilde{s}, 0)$ and then a successor in its winning set $\overline{U}_2$,
which contradicts the assumption that the strategy $\overline{\sigma}$ is a
sure winning strategy for player 1 in the game $\overline{G}$. It follows from
Lemma 1 that for any strategy $\pi$, with probability 1 the set of states
visited infinitely often along the play resulting from $\sigma$ and $\pi$ is an
end component in $U_1$. Since every end component in
$(G \upharpoonright U_1)_\sigma$ is winning for player 1, the strategy $\sigma$
is an almost-sure winning strategy for player 1. ■

Lemma 4 Let $G$ be a 2½-player game graph with a set
$P = \{(e_1, f_1), (e_2, f_2), \ldots, (e_d, f_d)\}$ of $d$ pairs of colors and
winning objective $\mathrm{Rabin}(P)$ for player 1. There exists a
finite-memory strategy $\pi$ for player 2 in the game $G$ such that for all
strategies $\sigma$ for player 1 we have
$\Pr_s^{\sigma,\pi}(\mathrm{Streett}(P)) > 0$, for all states $s \in U_2$. Hence
$S \setminus U_1 \subseteq S \setminus \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\mathrm{Rabin}(P))$.

Proof. The proof idea is similar to the proof of Lemma 3. Consider a
finite-memory sure winning strategy $\overline{\pi}$ for player 2 in the game
$\overline{G} \upharpoonright \overline{U}_2$, and let $\pi$ be the
corresponding strategy in $G$. Let $M$ be the memory of the strategy
$\overline{\pi}$. We argue that every end component in the game
$(G \upharpoonright U_2)_\pi$ is winning for player 2. Consider the product game
$(G \times M \upharpoonright U_2 \times M)$ and the memoryless strategy $\pi_m$
corresponding to $\pi$ in the game $G \times M$. It suffices to argue that
every end component in $(G \times M \upharpoonright U_2 \times M)_{\pi_m}$ is
winning for player 2. Let $C$ be an end component in
$(G \times M \upharpoonright U_2 \times M)_{\pi_m}$ that is winning for
player 1; then we construct an s.c.c. that is winning for player 1 in
$(\overline{G} \times M \upharpoonright \overline{U}_2 \times M)_{\overline{\pi}_m}$,
which is a contradiction since $\overline{\pi}$ is a sure winning strategy for
player 2 in $\overline{G} \upharpoonright \overline{U}_2$. We describe the key
steps to construct a winning s.c.c. $\overline{C}$ from a winning end component
$C$; mainly we describe the strategy corresponding to a probabilistic state.
If $C$ is a winning end component for player 1, let $i$ be the Rabin pair
witnessing that $C$ is winning, i.e., $C \cap F_i \neq \emptyset$ and
$C \cap E_i = \emptyset$. The strategy for player 1 is as follows:

Figure 3: The strategy sub-graph.


•  If the strategy for player 2 at a state $(\overline{s}, m)$ chooses the
   successor $((\widetilde{s}, 0), m')$, then the following successor state is
   $((\widehat{s}, 0), m')$, and since
   $[(\widehat{s}, 0)] = \{f_1, f_2, \ldots, f_d\}$, player 1 ensures that a
   state in $F_i$ is visited.

•  If the strategy for player 2 at a state $(\overline{s}, m)$ chooses a
   successor $((\widetilde{s}, 2i), m')$, then player 1 chooses the successor
   $((\widehat{s}, 2i), m')$, where $m, m' \in M$. Since
   $[(\widehat{s}, 2i)] = f_i$, player 1 ensures that a state in $F_i$ is
   visited.

•  If the strategy for player 2 at a state $(\overline{s}, m)$ chooses a
   successor $((\widetilde{s}, 2j), m')$, for $j \neq i$, then player 1 chooses
   the successor $((\widehat{s}, 2j-1), m')$ and then a successor that shortens
   the distance to the set $F_i$, where $m, m' \in M$. Since
   $[(\widehat{s}, 2j-1)] = e_j \neq e_i$, player 1 ensures that a state in
   $E_i$ is not visited.

The  construction  is  illustrated  in  Fig.  3. 

Consider any terminal s.c.c. $Y$ in the sub-game thus constructed. The
strategy for player 1 ensures that in the sub-game $\overline{C}$, whenever a
state $\overline{s}$ is visited such that $s \in S_\bigcirc$, no state in $E_i$
is visited. Since $C \cap E_i = \emptyset$ it follows that
$Y \cap E_i = \emptyset$. Moreover, the strategy for player 1 ensures that a
state in $F_i$ is always visited, i.e., $Y \cap F_i \neq \emptyset$. Hence in
the sub-game of
$(\overline{G} \times M \upharpoonright \overline{C} \times M)_{\overline{\pi}_m}$
every terminal s.c.c. $Y$ is winning for player 1, i.e.,
$F_i \cap Y \neq \emptyset$ and $E_i \cap Y = \emptyset$. However, this is a
contradiction since $\overline{\pi}$ is a sure winning strategy for player 2.
Hence, all the end components in
$(G \times M \upharpoonright U_2 \times M)_{\pi_m}$ are winning for player 2.

Note that $(G \times M \upharpoonright U_2 \times M)_{\pi_m}$ is a finite-state
player-1 MDP, and if player 1 can win almost-surely she can win by a pure
memoryless strategy. Hence, it suffices to argue that player 2 wins with
probability greater than 0 from every state $s$ in
$(G \times M \upharpoonright U_2 \times M)_{\pi_m}$ against all pure memoryless
strategies $\sigma$ for player 1 in
$(G \times M \upharpoonright U_2 \times M)_{\pi_m}$. For every probabilistic
state $s \in S_\bigcirc \cap U_2$, at least one successor must be in the set
$U_2$. Otherwise, if both the successors of $s$ are in $U_1$, it follows from
the construction of the gadget that
$\overline{s} \in \langle\langle 1 \rangle\rangle_{\mathit{sure}}(\mathrm{Reach}(\overline{U}_1))$
in the game $\overline{G}$. In other words, there is a strategy for player 1 in
the 2-player game to reach the set $\overline{U}_1$ from $\overline{s}$; this
leads to $\overline{s} \in \overline{U}_1$, which is a contradiction. Hence for
any pure memoryless strategy $\sigma$ consider the Markov chain
$(G \times M \upharpoonright U_2 \times M)_{\sigma,\pi_m}$. From every state
$s \in U_2 \times M$ there is a path to a terminal strongly connected component
in $U_2 \times M$, i.e., there is a path to a closed recurrent class that is a
subset of $U_2 \times M$. Every end component is winning for player 2 in
$U_2 \times M$. Hence, for every state $s \in U_2 \times M$ there is a path to a
closed recurrent class that is winning for player 2. Therefore, for any pure
memoryless strategy $\sigma$, in the Markov chain
$(G \times M \upharpoonright U_2 \times M)_{\sigma,\pi_m}$, if the play starts
at any state $s \in U_2 \times M$ there is a positive probability that it
reaches a terminal strongly connected component that is winning for player 2.
Hence the desired result follows. ■

It follows from Lemma 3 and Lemma 4 that
$U_1 = \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\mathrm{Rabin}(P))$.
Moreover, pure memoryless almost-sure winning strategies exist for 2½-player
Rabin games.

Theorem 4 The family of pure memoryless strategies suffices for almost-sure
winning with respect to Rabin objectives on 2½-player game graphs.

5  From  almost-sure  to  optimal 

In this section we show how to extend the sufficiency result for a family of
strategies from almost-sure winning to optimality for any ω-regular objective.

Definition 3 (Value class) Given an ω-regular objective $\Phi$, for any real
$r \in \mathbb{R}$, we denote by $\mathrm{VC}(r)$ the value class with value
$r$, i.e.,
$\mathrm{VC}(r) = \{\, s \in S \mid \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s) = r \,\}$. ■

The following proposition states that there exist optimal strategies for
player 1 that never choose an edge to a lower value class.

Proposition 2 For all ω-regular objectives $\Phi$ there exists an optimal
strategy $\sigma$ for player 1 such that for any sequence $w \in S^*$ and
$s \in S_1$ we have $\sigma(w \cdot s)(t) = 0$ if
$\langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(t) < \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s)$.

The following proposition follows from Theorem 4.

Proposition 3 ([3]) For all ω-regular objectives $\Phi$ and for all 2½-player
game graphs, the limit-sure and almost-sure winning sets coincide:
$\langle\langle 1 \rangle\rangle_{\mathit{limit}}(\Phi) = \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\Phi)$
and
$\langle\langle 2 \rangle\rangle_{\mathit{limit}}(\Omega \setminus \Phi) = \langle\langle 2 \rangle\rangle_{\mathit{almost}}(\Omega \setminus \Phi)$.

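Definition 4 (Boundary probabilistic states) Given a value class
$\mathrm{VC}(r)$, a probabilistic state $s \in \mathrm{VC}(r)$ is a boundary
probabilistic state if there exists a successor $t$ of $s$ such that
$\langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(t) \neq \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s)$.
Note that for every boundary probabilistic state $s$ there exist successors
$t_1, t_2$ of $s$ such that
$\langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(t_1) < \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s)$
and
$\langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(t_2) > \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s)$,
since the value of a probabilistic state is the average of the values of its
successors, weighted by the transition probabilities. ■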
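Value classes and boundary probabilistic states can be illustrated with a small sketch. It assumes that the values are already known and stored exactly (here as `Fraction`s in a dictionary `val`); the encoding and the names `value_classes` and `boundary_states` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def value_classes(val):
    """Group states by value: returns a dict mapping r to VC(r)."""
    classes = defaultdict(set)
    for s, r in val.items():
        classes[r].add(s)
    return dict(classes)


def boundary_states(val, succ, prob, r):
    """Probabilistic states of VC(r) with a successor outside VC(r)."""
    return {s for s in prob
            if val[s] == r and any(val[t] != r for t in succ[s])}


half = Fraction(1, 2)
# "p" moves to "w" or "l" with probability 1/2 each; "q" moves to "p" surely.
succ = {"a": {"p", "l"}, "p": {"w", "l"}, "q": {"p"}, "w": {"w"}, "l": {"l"}}
prob = {"p", "q", "w", "l"}
val = {"a": half, "p": half, "q": half, "w": Fraction(1), "l": Fraction(0)}

B = boundary_states(val, succ, prob, half)
print(B)
```

In this example only `"p"` is a boundary probabilistic state of VC(1/2), and it has one successor of strictly lower value and one of strictly higher value, as noted in Definition 4.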

Lemma 5 Consider a 2½-player game with an ω-regular objective $\Phi$.
Given a value class $\mathrm{VC}(r)$, with $0 < r < 1$, let $B(r)$ be the set
of boundary probabilistic states of the value class $\mathrm{VC}(r)$. Convert
each state in $B(r)$ to a sink state that is winning for player 1. Let the new
game be $G'$. Then player 1 wins almost-surely in the sub-game
$G' \upharpoonright \mathrm{VC}(r)$.

Proof. Assume that player 1 does not win almost-surely from every state
in $G' \upharpoonright \mathrm{VC}(r)$. Then there exists a state where
player 2 wins with positive bounded probability. It follows from Corollary 1
of [8] and Proposition 3 that there exists a non-empty set
$U \subseteq \mathrm{VC}(r)$ such that player 2 wins almost-surely from $U$.
Consider an optimal strategy $\sigma$ that never chooses an edge with positive
probability to a lower value class (such a strategy exists by Proposition 2).
Since player 2 wins almost-surely from $U$, it follows that for every state
$s \in U \cap S_1$, for every successor $t$ of $s$ in $\mathrm{VC}(r)$ we have
$t \in U$. It follows that every move of the strategy $\sigma$ from a state in
$U$ stays in $U$. Hence player 2 wins almost-surely from $U$ against $\sigma$.
However, this contradicts the assumption that $r > 0$ and that $\sigma$ is an
optimal strategy. ■

It follows from Lemma 5 that in every value class, if the boundary
probabilistic states are assumed to be winning for player 1, then player 1
wins almost-surely. We call such an almost-sure winning strategy a conditional
almost-sure winning strategy.

Definition 5 (Qualitative optimal strategy) A strategy $\sigma$ is qualitative
optimal for player 1, for an ω-regular objective $\Phi$, if the following
conditions hold:

•  For every state $s \in \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\Phi)$
   the strategy $\sigma$ is almost-sure winning.

•  For every state $s \in \mathrm{VC}(r)$ such that $0 < r < 1$, there is a
   constant $c > 0$ such that
   $\inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi) > c$. ■

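Lemma 6 Consider a strategy $\sigma$ and an ω-regular objective $\Phi$ such
that $\sigma$ is almost-sure winning from every state
$s \in \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\Phi)$ and $\sigma$ is
a conditional almost-sure winning strategy from every state $s$ in
$S \setminus \langle\langle 2 \rangle\rangle_{\mathit{almost}}(\Omega \setminus \Phi)$;
then $\sigma$ is qualitative optimal for $\Phi$.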

Proof. Since the strategy $\sigma$ is conditional almost-sure winning, it
follows that for any strategy $\pi$ that is optimal against $\sigma$, the play
reaches the boundary probabilistic states with positive probability, for
$s \in \mathrm{VC}(r)$ and $r > 0$. From every boundary probabilistic state the
game proceeds to a higher value class with positive probability. By an easy
induction on the number of value classes it follows that from every state in
$S \setminus \langle\langle 2 \rangle\rangle_{\mathit{almost}}(\Omega \setminus \Phi)$
the game reaches $\langle\langle 1 \rangle\rangle_{\mathit{almost}}(\Phi)$ with
positive probability. Since $\sigma$ is almost-sure winning for every state
$s \in \langle\langle 1 \rangle\rangle_{\mathit{almost}}(\Phi)$, it follows
that $\sigma$ is qualitative optimal. ■

Definition 6 (Locally optimal strategies) A strategy $\sigma$ is locally
optimal for an ω-regular objective $\Phi$ if for all $w \in S^*$ and
$s \in S_1$ we have $\sigma(w \cdot s)(t) = 0$ if
$\langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(t) < \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s)$. ■

Note that by definition a conditional almost-sure winning strategy is
locally optimal. The following proof is similar to the proof of Lemma 5.3
of [4].

Lemma 7 Consider a 2½-player game $G$ with an ω-regular objective $\Phi$ for
player 1. Let $\sigma$ be a memoryless strategy such that $\sigma$ is
qualitative optimal and locally optimal for $\Phi$. Then $\sigma$ is an optimal
strategy for $\Phi$.

Proof. Given that $\sigma$ is memoryless, the game $G_\sigma$ is a player-2
MDP. Since $\sigma$ is a qualitative optimal strategy, it follows that for
every state $s \in \mathrm{VC}(r)$, for $r > 0$, for all strategies $\pi$ of
player 2 we have $\Pr_s^{\sigma,\pi}(\Phi) > c$, for some constant $c > 0$.
Hence, the set of almost-sure winning states for player 2 in $G_\sigma$
coincides with the set of almost-sure winning states for player 2 in $G$. Let
us denote by $W_2$ the set of almost-sure winning states for player 2 in $G$
and $G_\sigma$, i.e.,
$W_2 = \langle\langle 2 \rangle\rangle_{\mathit{almost}}(\Omega \setminus \Phi)$.
It follows from the analysis of MDPs that in the game $G_\sigma$, for all
states $s$ we have
$\langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)(s) = \langle\langle 2 \rangle\rangle_{\mathit{val}}(\mathrm{Reach}(W_2))(s)$.
By [6, 5], the values are the unique solution to the linear program consisting
in minimizing $\sum_{s \in S} x_s$ subject to:

$\forall s \in S_\bigcirc :\ x_s = \sum_{t \in E(s)} x_t \cdot \delta(s)(t)$
$\forall s \in S_1 :\ x_s = \sum_{t \in E(s)} x_t \cdot \sigma(s)(t)$
$\forall s \in S_2,\ \forall t \in E(s) :\ x_s \ge x_t$
$\forall s \in S :\ x_s \ge 0$
$\forall s \in W_2 :\ x_s = 1$

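Let us denote by $x^*$ the optimal solution of the above linear program. The
local optimality of the strategy $\sigma$ ensures that for every state
$s \in S_1$,
$\widehat{x}_s = \langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)(s)$
satisfies the constraints of the linear program. Moreover,
$\widehat{x}_s = \langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)(s)$
satisfies the constraints for all states $s \in S_2 \cup S_\bigcirc$.
Hence, $\widehat{x} = \langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)$
is a feasible solution of the linear program. Since the above linear program
is a minimization problem, we have
$x^*_s \le \widehat{x}_s = \langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)(s)$
for all $s \in S$. It follows that in the MDP $G_\sigma$ we have
$\sup_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Omega \setminus \Phi) \le \langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)(s)$.
Hence it follows that
$\inf_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Phi) = 1 - \sup_{\pi \in \Pi} \Pr_s^{\sigma,\pi}(\Omega \setminus \Phi) \ge 1 - \langle\langle 2 \rangle\rangle_{\mathit{val}}(\Omega \setminus \Phi)(s) = \langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s)$.
This implies that $\sigma$ is an optimal strategy for player 1 in $G$. ■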
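The feasibility argument in the proof of Lemma 7 can be checked mechanically on small instances. The sketch below does not solve the linear program; it only tests whether a candidate vector satisfies the constraints of the program for the MDP obtained by fixing the strategy of player 1. The encoding is an assumption made for illustration: `owner[s]` is `1`, `2`, or `"p"`, `delta` and `sigma` map a state to a dictionary of successor probabilities, and exact `Fraction` arithmetic is used so that equalities can be compared directly.

```python
from fractions import Fraction


def feasible(x, succ, owner, delta, sigma, W2):
    """Check the constraints of the linear program for the MDP G_sigma."""
    for s, ts in succ.items():
        if x[s] < 0:
            return False
        if s in W2 and x[s] != 1:
            return False
        if owner[s] == "p" and x[s] != sum(x[t] * delta[s].get(t, 0) for t in ts):
            return False
        if owner[s] == 1 and x[s] != sum(x[t] * sigma[s].get(t, 0) for t in ts):
            return False
        if owner[s] == 2 and any(x[s] < x[t] for t in ts):
            return False
    return True


half = Fraction(1, 2)
succ = {"b": {"a"}, "a": {"p", "l"}, "p": {"w", "l"}, "w": {"w"}, "l": {"l"}}
owner = {"b": 1, "a": 2, "p": "p", "w": "p", "l": "p"}
delta = {"p": {"w": half, "l": half}, "w": {"w": 1}, "l": {"l": 1}}
sigma = {"b": {"a": 1}}  # the fixed memoryless strategy of player 1
W2 = {"w"}               # almost-sure winning states of player 2

x_min = {"b": half, "a": half, "p": half, "w": 1, "l": 0}
x_big = {s: 1 for s in succ}   # feasible, but not minimal
x_bad = dict(x_min, a=Fraction(1, 4))

print(feasible(x_min, succ, owner, delta, sigma, W2))
```

The vector `x_big` shows why the program minimizes the sum: the constraints alone also admit solutions that overestimate the reachability values, for instance at the losing sink `"l"`.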

Observe that arguments similar to the arguments of Lemma 7 can be
extended to the synchronous product of the game graph $G$ with any finite
memory $M$. Hence, the proof of Lemma 7 can be easily extended to a
finite-memory strategy $\sigma$ in place of a memoryless strategy $\sigma$.
This gives us the following general theorem.

Theorem 5 If a family $\Sigma^{f} \subseteq \Sigma$ of strategies suffices for
almost-sure winning with respect to an ω-regular objective $\Phi$ on 2½-player
game graphs, then $\Sigma^{f}$ suffices for optimality with respect to the
objective $\Phi$ on 2½-player game graphs.

Since pure memoryless strategies suffice for almost-sure winning with respect
to Rabin objectives on 2½-player game graphs (Theorem 4), the following
theorem is immediate from Theorem 5.

Theorem 6 The family of pure memoryless strategies suffices for optimality
with respect to all Rabin objectives on 2½-player game graphs.

Theorem 7 Given a 2½-player game graph $G$, an objective $\Phi$ for player 1,
a state $s \in S$ and a rational $r \in \mathbb{R}$, the complexity of
determining whether
$\langle\langle 1 \rangle\rangle_{\mathit{val}}(\Phi)(s) \ge r$ is as follows.

1.  NP-complete if $\Phi$ is a Rabin objective.

2.  coNP-complete if $\Phi$ is a Streett objective.

3.  [4, 18] NP ∩ coNP if $\Phi$ is a parity objective.

Proof.

1.  Let $G$ be a 2½-player game with a Rabin objective $\mathrm{Rabin}(P)$ for
    player 1. Given a pure memoryless optimal strategy $\sigma$ for player 1,
    the game $G_\sigma$ is a player-2 MDP with a Streett objective for
    player 2. Since the values of MDPs with Streett objectives can be computed
    in polynomial time (Section 3), the problem is in NP. The NP-hardness
    follows from the fact that 2-player games with Rabin objectives are
    NP-hard [12, 23].

2.  Follows immediately from the fact that Streett objectives are
    complementary to Rabin objectives.

3.  Follows from the previous two completeness results, as a parity objective
    is both a Rabin objective and a Streett objective. ■
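The NP upper bound rests on the fact that a pure memoryless strategy is a short witness: it picks one successor per player-1 state, and fixing it leaves a player-2 MDP that can be analyzed in polynomial time. The sketch below only illustrates the witness and the induced graph; the encoding (`owner[s]` equal to `1`, `2`, or `"p"`) and the function names are hypothetical, and the polynomial-time Streett analysis of the resulting MDP is not included.

```python
from math import prod


def fix_strategy(succ, owner, sigma):
    """Restrict every player-1 state to the single successor chosen by sigma."""
    for s, t in sigma.items():
        assert owner[s] == 1 and t in succ[s]  # sigma must pick existing edges
    return {s: ({sigma[s]} if owner[s] == 1 else set(ts))
            for s, ts in succ.items()}


def num_witnesses(succ, owner):
    """Number of pure memoryless strategies of player 1."""
    return prod(len(ts) for s, ts in succ.items() if owner[s] == 1)


succ = {"a": {"b", "c"}, "b": {"a"}, "c": {"c", "a"}}
owner = {"a": 1, "b": 2, "c": 1}
mdp = fix_strategy(succ, owner, {"a": "b", "c": "c"})
print(mdp, num_witnesses(succ, owner))
```

A single witness has size linear in the number of player-1 states, whereas the number of candidate witnesses is the product of the out-degrees of those states and can be exponential.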